Introduction:

In this project, I have chosen to analyze the datasets provided by the CDC containing average BMI for each state (containing 52 provinces/states of the United States) which gave the correlating state’s name and ranked the states. In the complementary dataset I chose, I looked through the data.gov national data base for a supplemental dataset containing information on urban and rural areas in each state that they were able to gather and the average BMI and correlating average year-long temperature in Farenheit. I felt in these datasets with the common numeric variable BMI, it would be interesting to have the ability to assess the effect of the environment (whether it be weather dependendent or situational like rural or urban areas) on overall health determined by BMI. Furthermore, identifying these areas by their state would be quite interesting to me to gauge where in the United States it is statistically most unhealthy according to the CDC, looking at all these variables.

##Joining/Merging Datasets
r = getOption("repos")
r["CRAN"] = "http://cran.us.r-project.org" 
options(repos = r)

install.packages("tidyr")
## Installing package into '/stor/home/lmc3757/R/x86_64-pc-linux-gnu-library/3.4'
## (as 'lib' is unspecified)
install.packages("tidyverse")
## Installing package into '/stor/home/lmc3757/R/x86_64-pc-linux-gnu-library/3.4'
## (as 'lib' is unspecified)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(tidyverse)
## ── Attaching packages ──────────────────────────── tidyverse 1.3.0 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(ggplot2)
BMI3<-read.csv("BMI3.csv")
BMI3<-BMI3%>%select(State.s.Average.Temperature,Urban.or.Rural,BMI)
NOBS2<-read.csv("NOBS2.csv")
mergeddata<-full_join(BMI3,NOBS2, by=c("BMI"="Obesity"))
glimpse(mergeddata)
## Observations: 127
## Variables: 5
## $ State.s.Average.Temperature <dbl> 62.8, 64.8, 64.8, 41.0, 54.2, 48.6, 60.4,…
## $ Urban.or.Rural              <fct> urban, urban, rural, urban, urban, rural,…
## $ BMI                         <dbl> 31.2, 28.9, 22.7, 18.8, 20.4, 21.7, 21.0,…
## $ National_Obesity_By_State   <int> 11, 44, 41, NA, NA, NA, NA, NA, 43, NA, 1…
## $ NAME                        <fct> Michigan, Maryland, Hawaii, NA, NA, NA, N…

After utilizing the dplyr function full_join, there were 127 rows and 5 variables, not dropping any information or data and instead filling in NA where rows had missing values to match up data with. I utilized the function of full_join in order to take into account each state’s urban and rural BMI values as well and to have enough data to substantially reflect a more average mean when I was to compare the BMIs dependent on their environment and matching state in the other dataset. This could potentially lead to many NAs I will have to filter out in the analysis of summary statistics as some rows will have missing data that does not match up with its counterpart. Additionally, I untidied and retidied one of my datasets I merged in order to avoid confusion when looking back for reference and changed my values from “Obesity” to “BMI” so it correlates with the merged dataset.

NOBS <- NOBS2%>%pivot_wider(names_from="NAME", values_from ="Obesity")%>%pivot_longer(1:51,values_to="BMI")

Wrangling: 6 core dplyr functions

mergeddata %>% mutate(BMI = as.numeric(BMI))
##     State.s.Average.Temperature Urban.or.Rural  BMI National_Obesity_By_State
## 1                          62.8          urban 31.2                        11
## 2                          64.8          urban 28.9                        44
## 3                          64.8          rural 22.7                        41
## 4                          41.0          urban 18.8                        NA
## 5                          54.2          urban 20.4                        NA
## 6                          48.6          rural 21.7                        NA
## 7                          60.4          rural 21.0                        NA
## 8                          60.3          urban 29.1                        NA
## 9                          60.4          urban 28.8                        43
## 10                         26.6          urban 27.6                        NA
## 11                         66.4          urban 30.8                        19
## 12                         64.8          rural 28.1                        NA
## 13                         50.1          rural 17.4                        NA
## 14                         42.9          rural 20.2                        25
## 15                         48.6          urban 26.0                        36
## 16                         64.8          urban 28.8                        43
## 17                         51.8          rural 18.2                        NA
## 18                         53.4          rural 17.5                        NA
## 19                         54.2          rural 23.4                        NA
## 20                         42.0          urban 27.6                        NA
## 21                         48.4          urban 21.8                        NA
## 22                         45.1          urban 18.8                        NA
## 23                         45.1          urban 17.6                        NA
## 24                         63.5          urban 26.8                        NA
## 25                         70.7          rural 28.7                        NA
## 26                         62.3          rural 25.4                        NA
## 27                         70.0          urban 31.0                        50
## 28                         48.4          rural 16.4                        NA
## 29                         59.4          rural 21.7                        NA
## 30                         63.4          urban 24.5                        27
## 31                         54.5          rural 22.3                        NA
## 32                         42.7          rural 21.5                        NA
## 33                         49.0          urban 21.7                        NA
## 34                         42.7          rural 16.8                        NA
## 35                         42.7          urban 22.4                        NA
## 36                         55.6          rural 20.3                        NA
## 37                         42.7          rural 21.9                        NA
## 38                         55.3          urban 29.9                        NA
## 39                         51.7          rural 21.3                        NA
## 40                         43.8          rural 18.8                        NA
## 41                         63.5          urban 28.7                        NA
## 42                         70.7          urban 29.0                        29
## 43                         59.0          urban 24.5                        27
## 44                         59.4          urban 22.0                        NA
## 45                         45.1          rural 20.5                        NA
## 46                         45.0          urban 20.3                        NA
## 47                         49.9          rural 19.4                        NA
## 48                         60.4          urban 26.0                        36
## 49                         64.8          urban 27.7                        NA
## 50                         57.6          rural 24.3                        32
## 51                         45.4          rural 18.5                        NA
## 52                         64.8          urban 16.2                        NA
## 53                         53.4          rural 21.3                        NA
## 54                         45.4          urban 28.0                        NA
## 55                         40.4          rural 20.7                        NA
## 56                         54.2          rural 23.0                        NA
## 57                         48.3          rural 28.6                        12
## 58                         59.0          urban 21.9                        NA
## 59                         43.1          rural 20.1                        NA
## 60                         51.8          urban 26.6                        NA
## 61                         47.9          rural 20.3                        NA
## 62                         48.3          urban 24.0                        NA
## 63                         66.4          urban 27.8                        NA
## 64                         44.4          rural 28.0                        NA
## 65                         64.8          urban 30.2                        NA
## 66                         41.2          urban 19.0                        NA
## 67                         62.4          urban 31.3                        31
## 68                         50.1          urban 23.0                        NA
## 69                         66.4          urban 25.6                        49
## 70                         65.2          rural 24.3                        32
## 71                         60.4          urban 32.0                        NA
## 72                         40.4          rural 28.4                        42
## 73                         47.9          rural 25.0                        10
## 74                         54.2          urban 29.6                        NA
## 75                         47.9          urban 22.3                        NA
## 76                         62.8          urban 29.6                        NA
## 77                         60.3          urban 22.0                        NA
## 78                         59.6          rural 21.7                        NA
## 79                         50.7          urban 19.3                        NA
## 80                         44.4          rural 18.7                        NA
## 81                         52.7          urban 22.0                        NA
## 82                         48.6          urban 21.0                        NA
## 83                         44.4          rural 15.9                        NA
## 84                         60.4          urban 16.2                        NA
## 85                         45.2          rural 16.2                        NA
## 86                         53.4          urban 26.0                        36
## 87                         62.4          rural 25.1                        48
## 88                         57.6          urban 32.0                        NA
## 89                         60.2          rural 19.3                        NA
## 90                         53.4          urban 23.0                        NA
## 91                         40.4          rural 21.4                        NA
## 92                         54.2          urban 24.2                         2
## 93                           NA                  NA                        NA
## 94                           NA                  NA                        NA
## 95                           NA           <NA> 32.4                         1
## 96                           NA           <NA> 34.6                         3
## 97                           NA           <NA> 30.7                         4
## 98                           NA           <NA> 30.7                         5
## 99                           NA           <NA> 30.1                         6
## 100                          NA           <NA> 29.2                         7
## 101                          NA           <NA> 33.8                         8
## 102                          NA           <NA> 36.2                         9
## 103                          NA           <NA> 29.8                        13
## 104                          NA           <NA> 23.6                        14
## 105                          NA           <NA> 26.1                        15
## 106                          NA           <NA> 31.4                        16
## 107                          NA           <NA> 26.4                        17
## 108                          NA           <NA> 29.8                        18
## 109                          NA           <NA> 32.4                        20
## 110                          NA           <NA> 32.1                        21
## 111                          NA           <NA> 30.4                        22
## 112                          NA           <NA> 34.5                        23
## 113                          NA           <NA> 35.6                        24
## 114                          NA           <NA> 30.1                        26
## 115                          NA           <NA> 33.9                        28
## 116                          NA           <NA> 35.6                        30
## 117                          NA           <NA> 26.7                        33
## 118                          NA           <NA> 25.3                        34
## 119                          NA           <NA> 22.1                        35
## 120                          NA           <NA> 35.6                        37
## 121                          NA           <NA> 29.5                        38
## 122                          NA           <NA> 31.7                        39
## 123                          NA           <NA> 30.0                        40
## 124                          NA           <NA> 29.7                        45
## 125                          NA           <NA> 30.0                        46
## 126                          NA           <NA> 34.2                        47
## 127                          NA           <NA> 26.3                        51
##                     NAME
## 1               Michigan
## 2               Maryland
## 3                 Hawaii
## 4                   <NA>
## 5                   <NA>
## 6                   <NA>
## 7                   <NA>
## 8                   <NA>
## 9             New Mexico
## 10                  <NA>
## 11              Illinois
## 12                  <NA>
## 13                  <NA>
## 14              Colorado
## 15          Rhode Island
## 16            New Mexico
## 17                  <NA>
## 18                  <NA>
## 19                  <NA>
## 20                  <NA>
## 21                  <NA>
## 22                  <NA>
## 23                  <NA>
## 24                  <NA>
## 25                  <NA>
## 26                  <NA>
## 27          North Dakota
## 28                  <NA>
## 29                  <NA>
## 30                  Utah
## 31                  <NA>
## 32                  <NA>
## 33                  <NA>
## 34                  <NA>
## 35                  <NA>
## 36                  <NA>
## 37                  <NA>
## 38                  <NA>
## 39                  <NA>
## 40                  <NA>
## 41                  <NA>
## 42               Wyoming
## 43                  Utah
## 44                  <NA>
## 45                  <NA>
## 46                  <NA>
## 47                  <NA>
## 48          Rhode Island
## 49                  <NA>
## 50         Massachusetts
## 51                  <NA>
## 52                  <NA>
## 53                  <NA>
## 54                  <NA>
## 55                  <NA>
## 56                  <NA>
## 57                 Idaho
## 58                  <NA>
## 59                  <NA>
## 60                  <NA>
## 61                  <NA>
## 62                  <NA>
## 63                  <NA>
## 64                  <NA>
## 65                  <NA>
## 66                  <NA>
## 67               Indiana
## 68                  <NA>
## 69            New Jersey
## 70         Massachusetts
## 71                  <NA>
## 72               Arizona
## 73              New York
## 74                  <NA>
## 75                  <NA>
## 76                  <NA>
## 77                  <NA>
## 78                  <NA>
## 79                  <NA>
## 80                  <NA>
## 81                  <NA>
## 82                  <NA>
## 83                  <NA>
## 84                  <NA>
## 85                  <NA>
## 86          Rhode Island
## 87               Vermont
## 88                  <NA>
## 89                  <NA>
## 90                  <NA>
## 91                  <NA>
## 92            California
## 93                  <NA>
## 94                  <NA>
## 95                 Texas
## 96              Kentucky
## 97               Georgia
## 98             Wisconsin
## 99                Oregon
## 100             Virginia
## 101            Tennessee
## 102            Louisiana
## 103               Alaska
## 104              Montana
## 105            Minnesota
## 106             Nebraska
## 107           Washington
## 108                 Ohio
## 109             Missouri
## 110                 Iowa
## 111         South Dakota
## 112             Arkansas
## 113          Mississippi
## 114       North Carolina
## 115             Oklahoma
## 116        West Virginia
## 117               Nevada
## 118          Connecticut
## 119 District of Columbia
## 120              Alabama
## 121         Rpuerto Rico
## 122       South Carolina
## 123                Maine
## 124             Delaware
## 125         Pennsylvania
## 126               Kansas
## 127        New Hampshire
mergeddata%>%filter(between(BMI,15.9,25)) %>% na.omit(NAME)
##    State.s.Average.Temperature Urban.or.Rural  BMI National_Obesity_By_State
## 1                         64.8          rural 22.7                        41
## 7                         42.9          rural 20.2                        25
## 16                        63.4          urban 24.5                        27
## 26                        59.0          urban 24.5                        27
## 31                        57.6          rural 24.3                        32
## 43                        65.2          rural 24.3                        32
## 44                        47.9          rural 25.0                        10
## 58                        54.2          urban 24.2                         2
##             NAME
## 1         Hawaii
## 7       Colorado
## 16          Utah
## 26          Utah
## 31 Massachusetts
## 43 Massachusetts
## 44      New York
## 58    California

In order to find how many states’ mean BMI was considered underweight/healthy by the CDC standard, I found that 6 out of the 52 provinces (including Puerto Rico and DC in the assessment) considered in the dataset of the US were considered underweight or healthy. I did this by utilizing the filter function in order to assess any states that BMI value fell under the value 25, which CDC said is the maximum healthy BMI. In my data, I distinguished the categorical variables as: urban or rural and name of state. On the other hand, the numerical variables were as follows: BMI; National Obesity by State; State’s Average Temperature.

mergeddata%>%select(contains("State")) %>%na.omit()
##    State.s.Average.Temperature National_Obesity_By_State
## 1                         62.8                        11
## 2                         64.8                        44
## 3                         64.8                        41
## 9                         60.4                        43
## 11                        66.4                        19
## 14                        42.9                        25
## 15                        48.6                        36
## 16                        64.8                        43
## 27                        70.0                        50
## 30                        63.4                        27
## 42                        70.7                        29
## 43                        59.0                        27
## 48                        60.4                        36
## 50                        57.6                        32
## 57                        48.3                        12
## 67                        62.4                        31
## 69                        66.4                        49
## 70                        65.2                        32
## 72                        40.4                        42
## 73                        47.9                        10
## 86                        53.4                        36
## 87                        62.4                        48
## 92                        54.2                         2
mergeddata%>%select(contains("n"))%>%na.omit()
##    Urban.or.Rural National_Obesity_By_State          NAME
## 1           urban                        11      Michigan
## 2           urban                        44      Maryland
## 3           rural                        41        Hawaii
## 9           urban                        43    New Mexico
## 11          urban                        19      Illinois
## 14          rural                        25      Colorado
## 15          urban                        36  Rhode Island
## 16          urban                        43    New Mexico
## 27          urban                        50  North Dakota
## 30          urban                        27          Utah
## 42          urban                        29       Wyoming
## 43          urban                        27          Utah
## 48          urban                        36  Rhode Island
## 50          rural                        32 Massachusetts
## 57          rural                        12         Idaho
## 67          urban                        31       Indiana
## 69          urban                        49    New Jersey
## 70          rural                        32 Massachusetts
## 72          rural                        42       Arizona
## 73          rural                        10      New York
## 86          urban                        36  Rhode Island
## 87          rural                        48       Vermont
## 92          urban                         2    California
mergeddata%>%mutate(average = State.s.Average.Temperature/BMI)
##     State.s.Average.Temperature Urban.or.Rural  BMI National_Obesity_By_State
## 1                          62.8          urban 31.2                        11
## 2                          64.8          urban 28.9                        44
## 3                          64.8          rural 22.7                        41
## 4                          41.0          urban 18.8                        NA
## 5                          54.2          urban 20.4                        NA
## 6                          48.6          rural 21.7                        NA
## 7                          60.4          rural 21.0                        NA
## 8                          60.3          urban 29.1                        NA
## 9                          60.4          urban 28.8                        43
## 10                         26.6          urban 27.6                        NA
## 11                         66.4          urban 30.8                        19
## 12                         64.8          rural 28.1                        NA
## 13                         50.1          rural 17.4                        NA
## 14                         42.9          rural 20.2                        25
## 15                         48.6          urban 26.0                        36
## 16                         64.8          urban 28.8                        43
## 17                         51.8          rural 18.2                        NA
## 18                         53.4          rural 17.5                        NA
## 19                         54.2          rural 23.4                        NA
## 20                         42.0          urban 27.6                        NA
## 21                         48.4          urban 21.8                        NA
## 22                         45.1          urban 18.8                        NA
## 23                         45.1          urban 17.6                        NA
## 24                         63.5          urban 26.8                        NA
## 25                         70.7          rural 28.7                        NA
## 26                         62.3          rural 25.4                        NA
## 27                         70.0          urban 31.0                        50
## 28                         48.4          rural 16.4                        NA
## 29                         59.4          rural 21.7                        NA
## 30                         63.4          urban 24.5                        27
## 31                         54.5          rural 22.3                        NA
## 32                         42.7          rural 21.5                        NA
## 33                         49.0          urban 21.7                        NA
## 34                         42.7          rural 16.8                        NA
## 35                         42.7          urban 22.4                        NA
## 36                         55.6          rural 20.3                        NA
## 37                         42.7          rural 21.9                        NA
## 38                         55.3          urban 29.9                        NA
## 39                         51.7          rural 21.3                        NA
## 40                         43.8          rural 18.8                        NA
## 41                         63.5          urban 28.7                        NA
## 42                         70.7          urban 29.0                        29
## 43                         59.0          urban 24.5                        27
## 44                         59.4          urban 22.0                        NA
## 45                         45.1          rural 20.5                        NA
## 46                         45.0          urban 20.3                        NA
## 47                         49.9          rural 19.4                        NA
## 48                         60.4          urban 26.0                        36
## 49                         64.8          urban 27.7                        NA
## 50                         57.6          rural 24.3                        32
## 51                         45.4          rural 18.5                        NA
## 52                         64.8          urban 16.2                        NA
## 53                         53.4          rural 21.3                        NA
## 54                         45.4          urban 28.0                        NA
## 55                         40.4          rural 20.7                        NA
## 56                         54.2          rural 23.0                        NA
## 57                         48.3          rural 28.6                        12
## 58                         59.0          urban 21.9                        NA
## 59                         43.1          rural 20.1                        NA
## 60                         51.8          urban 26.6                        NA
## 61                         47.9          rural 20.3                        NA
## 62                         48.3          urban 24.0                        NA
## 63                         66.4          urban 27.8                        NA
## 64                         44.4          rural 28.0                        NA
## 65                         64.8          urban 30.2                        NA
## 66                         41.2          urban 19.0                        NA
## 67                         62.4          urban 31.3                        31
## 68                         50.1          urban 23.0                        NA
## 69                         66.4          urban 25.6                        49
## 70                         65.2          rural 24.3                        32
## 71                         60.4          urban 32.0                        NA
## 72                         40.4          rural 28.4                        42
## 73                         47.9          rural 25.0                        10
## 74                         54.2          urban 29.6                        NA
## 75                         47.9          urban 22.3                        NA
## 76                         62.8          urban 29.6                        NA
## 77                         60.3          urban 22.0                        NA
## 78                         59.6          rural 21.7                        NA
## 79                         50.7          urban 19.3                        NA
## 80                         44.4          rural 18.7                        NA
## 81                         52.7          urban 22.0                        NA
## 82                         48.6          urban 21.0                        NA
## 83                         44.4          rural 15.9                        NA
## 84                         60.4          urban 16.2                        NA
## 85                         45.2          rural 16.2                        NA
## 86                         53.4          urban 26.0                        36
## 87                         62.4          rural 25.1                        48
## 88                         57.6          urban 32.0                        NA
## 89                         60.2          rural 19.3                        NA
## 90                         53.4          urban 23.0                        NA
## 91                         40.4          rural 21.4                        NA
## 92                         54.2          urban 24.2                         2
## 93                           NA                  NA                        NA
## 94                           NA                  NA                        NA
## 95                           NA           <NA> 32.4                         1
## 96                           NA           <NA> 34.6                         3
## 97                           NA           <NA> 30.7                         4
## 98                           NA           <NA> 30.7                         5
## 99                           NA           <NA> 30.1                         6
## 100                          NA           <NA> 29.2                         7
## 101                          NA           <NA> 33.8                         8
## 102                          NA           <NA> 36.2                         9
## 103                          NA           <NA> 29.8                        13
## 104                          NA           <NA> 23.6                        14
## 105                          NA           <NA> 26.1                        15
## 106                          NA           <NA> 31.4                        16
## 107                          NA           <NA> 26.4                        17
## 108                          NA           <NA> 29.8                        18
## 109                          NA           <NA> 32.4                        20
## 110                          NA           <NA> 32.1                        21
## 111                          NA           <NA> 30.4                        22
## 112                          NA           <NA> 34.5                        23
## 113                          NA           <NA> 35.6                        24
## 114                          NA           <NA> 30.1                        26
## 115                          NA           <NA> 33.9                        28
## 116                          NA           <NA> 35.6                        30
## 117                          NA           <NA> 26.7                        33
## 118                          NA           <NA> 25.3                        34
## 119                          NA           <NA> 22.1                        35
## 120                          NA           <NA> 35.6                        37
## 121                          NA           <NA> 29.5                        38
## 122                          NA           <NA> 31.7                        39
## 123                          NA           <NA> 30.0                        40
## 124                          NA           <NA> 29.7                        45
## 125                          NA           <NA> 30.0                        46
## 126                          NA           <NA> 34.2                        47
## 127                          NA           <NA> 26.3                        51
##                     NAME   average
## 1               Michigan 2.0128205
## 2               Maryland 2.2422145
## 3                 Hawaii 2.8546256
## 4                   <NA> 2.1808511
## 5                   <NA> 2.6568627
## 6                   <NA> 2.2396313
## 7                   <NA> 2.8761905
## 8                   <NA> 2.0721649
## 9             New Mexico 2.0972222
## 10                  <NA> 0.9637681
## 11              Illinois 2.1558442
## 12                  <NA> 2.3060498
## 13                  <NA> 2.8793103
## 14              Colorado 2.1237624
## 15          Rhode Island 1.8692308
## 16            New Mexico 2.2500000
## 17                  <NA> 2.8461538
## 18                  <NA> 3.0514286
## 19                  <NA> 2.3162393
## 20                  <NA> 1.5217391
## 21                  <NA> 2.2201835
## 22                  <NA> 2.3989362
## 23                  <NA> 2.5625000
## 24                  <NA> 2.3694030
## 25                  <NA> 2.4634146
## 26                  <NA> 2.4527559
## 27          North Dakota 2.2580645
## 28                  <NA> 2.9512195
## 29                  <NA> 2.7373272
## 30                  Utah 2.5877551
## 31                  <NA> 2.4439462
## 32                  <NA> 1.9860465
## 33                  <NA> 2.2580645
## 34                  <NA> 2.5416667
## 35                  <NA> 1.9062500
## 36                  <NA> 2.7389163
## 37                  <NA> 1.9497717
## 38                  <NA> 1.8494983
## 39                  <NA> 2.4272300
## 40                  <NA> 2.3297872
## 41                  <NA> 2.2125436
## 42               Wyoming 2.4379310
## 43                  Utah 2.4081633
## 44                  <NA> 2.7000000
## 45                  <NA> 2.2000000
## 46                  <NA> 2.2167488
## 47                  <NA> 2.5721649
## 48          Rhode Island 2.3230769
## 49                  <NA> 2.3393502
## 50         Massachusetts 2.3703704
## 51                  <NA> 2.4540541
## 52                  <NA> 4.0000000
## 53                  <NA> 2.5070423
## 54                  <NA> 1.6214286
## 55                  <NA> 1.9516908
## 56                  <NA> 2.3565217
## 57                 Idaho 1.6888112
## 58                  <NA> 2.6940639
## 59                  <NA> 2.1442786
## 60                  <NA> 1.9473684
## 61                  <NA> 2.3596059
## 62                  <NA> 2.0125000
## 63                  <NA> 2.3884892
## 64                  <NA> 1.5857143
## 65                  <NA> 2.1456954
## 66                  <NA> 2.1684211
## 67               Indiana 1.9936102
## 68                  <NA> 2.1782609
## 69            New Jersey 2.5937500
## 70         Massachusetts 2.6831276
## 71                  <NA> 1.8875000
## 72               Arizona 1.4225352
## 73              New York 1.9160000
## 74                  <NA> 1.8310811
## 75                  <NA> 2.1479821
## 76                  <NA> 2.1216216
## 77                  <NA> 2.7409091
## 78                  <NA> 2.7465438
## 79                  <NA> 2.6269430
## 80                  <NA> 2.3743316
## 81                  <NA> 2.3954545
## 82                  <NA> 2.3142857
## 83                  <NA> 2.7924528
## 84                  <NA> 3.7283951
## 85                  <NA> 2.7901235
## 86          Rhode Island 2.0538462
## 87               Vermont 2.4860558
## 88                  <NA> 1.8000000
## 89                  <NA> 3.1191710
## 90                  <NA> 2.3217391
## 91                  <NA> 1.8878505
## 92            California 2.2396694
## 93                  <NA>        NA
## 94                  <NA>        NA
## 95                 Texas        NA
## 96              Kentucky        NA
## 97               Georgia        NA
## 98             Wisconsin        NA
## 99                Oregon        NA
## 100             Virginia        NA
## 101            Tennessee        NA
## 102            Louisiana        NA
## 103               Alaska        NA
## 104              Montana        NA
## 105            Minnesota        NA
## 106             Nebraska        NA
## 107           Washington        NA
## 108                 Ohio        NA
## 109             Missouri        NA
## 110                 Iowa        NA
## 111         South Dakota        NA
## 112             Arkansas        NA
## 113          Mississippi        NA
## 114       North Carolina        NA
## 115             Oklahoma        NA
## 116        West Virginia        NA
## 117               Nevada        NA
## 118          Connecticut        NA
## 119 District of Columbia        NA
## 120              Alabama        NA
## 121         Rpuerto Rico        NA
## 122       South Carolina        NA
## 123                Maine        NA
## 124             Delaware        NA
## 125         Pennsylvania        NA
## 126               Kansas        NA
## 127        New Hampshire        NA

Here I played around with my variables using the dplyr function select in order to ascertain only the variables in which “n” or “State” was mentioned in the title. Additionally, I mutated a new variable called ‘average’ which made the BMI a function of State’s Average Temperature and provided a new column for each average division.

Summary Statistics for BMI:

mean(mergeddata$BMI, na.rm = T)
## [1] 25.44
sd(mergeddata$BMI, na.rm = T)
## [1] 5.209204
var(mergeddata$BMI, na.rm = T)
## [1] 27.13581
quantile(mergeddata$BMI, na.rm = T)
##   0%  25%  50%  75% 100% 
## 15.9 21.4 25.4 29.7 36.2
min(mergeddata$BMI, na.rm = T)
## [1] 15.9
max(mergeddata$BMI, na.rm = T)
## [1] 36.2
n_distinct(mergeddata$BMI, na.rm = T)
## [1] 90

Summary Statistics for State’s Average Temperature:

mean(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 53.69239
sd(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 8.90903
var(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 79.37082
quantile(mergeddata$State.s.Average.Temperature, na.rm=T)
##   0%  25%  50%  75% 100% 
## 26.6 45.4 53.4 60.4 70.7
min(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 26.6
max(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 70.7
n_distinct(mergeddata$State.s.Average.Temperature, na.rm=T)
## [1] 47

Summary Statistics for Residence Variable of Urban or Rural

mergeddata %>% group_by(Urban.or.Rural) %>% summarize('mean_BMI'=mean(BMI, na.rm=T))%>%arrange(desc(mean_BMI)) %>% na.omit()
## Warning: Factor `Urban.or.Rural` contains implicit NA, consider using
## `forcats::fct_explicit_na`
## # A tibble: 2 x 2
##   Urban.or.Rural mean_BMI
##   <fct>             <dbl>
## 1 urban              25.2
## 2 rural              21.6

In merging the data sets, I decided to assess if the average BMI was higher in rural or urban areas, indicative of health in certain environmental conditions of residence; however, some rows did not have BMI values assigned to them and therefore I had to use the forcats package that R recommended in order to change the implicit NAs within the variable ‘Urban or Rural’ to a “Missing” value which was also included in the assessment. This data ulitmately shows that the higher mean BMI belonged to Urban over Rural, while missing values showed outliers in the highest range.

mergeddata_BMIsd<-mergeddata%>%group_by(Urban.or.Rural)%>%summarize(sd_BMI=sd(BMI))%>%arrange(desc(sd_BMI))
## Warning: Factor `Urban.or.Rural` contains implicit NA, consider using
## `forcats::fct_explicit_na`
glimpse(mergeddata_BMIsd)
## Observations: 4
## Variables: 2
## $ Urban.or.Rural <fct> urban, NA, rural, 
## $ sd_BMI         <dbl> 4.407524, 3.578124, 3.509402, NA

This shows the correlating standard deviations for the assessment of the categorical variable’s BMIs.

PropUorR<-table(mergeddata$Urban.or.Rural)
PropUorR
## 
##       rural urban 
##     2    41    51
prop.table(PropUorR)
## 
##               rural     urban 
## 0.0212766 0.4361702 0.5425532

In analysing the categorical variables, there is a proportion of 0.328 rural and 0.408 urban inputs in this data frame; all other values belong to missing data.

nummerge<-mergeddata%>%select(-Urban.or.Rural,-NAME)
cormat<-cor(nummerge, use = "complete.obs")
glimpse(cormat)%>%round(2)
##  num [1:3, 1:3] 1 0.362 0.325 0.362 1 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:3] "State.s.Average.Temperature" "BMI" "National_Obesity_By_State"
##   ..$ : chr [1:3] "State.s.Average.Temperature" "BMI" "National_Obesity_By_State"
##                             State.s.Average.Temperature  BMI
## State.s.Average.Temperature                        1.00 0.36
## BMI                                                0.36 1.00
## National_Obesity_By_State                          0.32 0.07
##                             National_Obesity_By_State
## State.s.Average.Temperature                      0.32
## BMI                                              0.07
## National_Obesity_By_State                        1.00

In the first step, I created a new dataframe in order to pick only the numerical values for a correlation matrix to work (with BMI, National Obestity by State, and State’s Average Temperature). In summary statistics, some interesting things to note stood out with the mean BMI being at a value of 25.44, which shows the average BMI of the US is a little overweight. However, this goes with a standard deviation of 5.21 and the minumum BMI being 15.9 while the maximum stood out at a whopping 36.2, belonging to Louisiana. The maximum temperature was 70.7 while the minumum was 26.6. This correlated with associating higher and lower BMI. As we dive further into the analysis, I believe these two variables will show a correlation. Unfortunately, the assigned ranking system of the states did not have as much relationship with the other variables. This could be seen in the correlation matrix made, rounded to 2 decimal places, that indicates the highest correlation between BMI and state’s average temperature; essentially, further proving my hypothesis about their relationship. The lowest correlation was between BMI and national obesity by state/the ranking of state.

3 Plots of Data Visualization

# Correlation Heat Map
library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
melted_cormat <- melt(cormat)
head(melted_cormat)
##                          Var1                        Var2     value
## 1 State.s.Average.Temperature State.s.Average.Temperature 1.0000000
## 2                         BMI State.s.Average.Temperature 0.3617708
## 3   National_Obesity_By_State State.s.Average.Temperature 0.3247671
## 4 State.s.Average.Temperature                         BMI 0.3617708
## 5                         BMI                         BMI 1.0000000
## 6   National_Obesity_By_State                         BMI 0.0695437
ggplot(data = melted_cormat, aes(x=Var1, y=Var2, fill=value), title(main = "Correlation Heatmap")) +  geom_tile() +ggtitle(label = "Correlation Heatmap")

In this correlation heatmap, it was difficult to assess essential relationships as there were only three numeric variables to analyze. However, the strongest value in the correlation was evidently between State’s Average Temperature and BMI as it had a value of 0.3618. On the other hand, national obesity by state and BMI had the weakest and most artificial relationship at 0.0695.

#ggplot: Density Plot
mergeddata%>% ggplot(aes(x=BMI, fill=Urban.or.Rural)) + theme(legend.position=c(.9,.7)) +
  geom_density(alpha=.75)
## Warning: Removed 2 rows containing non-finite values (stat_density).

In this data I wanted to represent the relationship between BMI and living in a rural or urban environment utilizing a density plot as the population was quite big and it consisted of averages. It visually exemplifies how urban environments tended to have more of a range, but peaked and had their maximum values well above a value of 35. Showing that it might possibly be less healthy to be living in the city than in a rural environment. The reason I included the Missing values, or “NA”s was because research always has that gray area and will not always have distinct urban or rural acclaimed residences.

#3gglpot: Linear Regression
mergeddata %>% ggplot(aes(BMI, State.s.Average.Temperature))+ geom_point(stat = "summary") + ggtitle(label = "Effect of Temperature on BMI") + geom_smooth(method=lm)
## Warning: Removed 35 rows containing non-finite values (stat_summary).
## No summary function supplied, defaulting to `mean_se()
## Warning: Removed 35 rows containing non-finite values (stat_smooth).

In this ggplot, I utilized the scatter plot function of geom_point in order to visually analyze the potential relationship between State’s Average Temperature and BMI; although there are outliers, they are minimal and the regression line set showed a slight positive correlation. This relationship suggests that the higher the state’s average annual temperature, the higher the value of its state’s average BMI.

k-means/PAM clustering: Dimensionality Reduction

install.packages("cluster")
## Installing package into '/stor/home/lmc3757/R/x86_64-pc-linux-gnu-library/3.4'
## (as 'lib' is unspecified)
library(cluster)
pam_dat<-mergeddata%>% select(-NAME, -Urban.or.Rural) %>% na.omit()

sil_width<-vector()
for(i in 2:10){
     pam_fit <- pam(pam_dat, k = i)
     sil_width[i] <- pam_fit$silinfo$avg.width}
ggplot()+geom_line(aes(x=1:10,y=sil_width))+scale_x_continuous(name="k",breaks=1:10)
## Warning: Removed 1 rows containing missing values (geom_path).

PAM clustering was used to create a clustering data set by first scaling all my numeric variables by selecting out the categorical data. In order to find how many clusters to use, we must find average silhouette width. Therefore, a line plot was created out of the average silhouette widths and it yielded 2 clusters as the highest average.

pam1<-pam_dat%>%pam(2)
plot(pam1, which=2)

The average silhouette width was found to be 0.46 after k was set to 2, indicating that the clusters were set to variables with a weak and possibly artificial structure.

install.packages("plotly")
## Installing package into '/stor/home/lmc3757/R/x86_64-pc-linux-gnu-library/3.4'
## (as 'lib' is unspecified)
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
pamclust<- pam_dat %>% mutate(cluster=as.factor(pam1$clustering))
pamclust
##    State.s.Average.Temperature  BMI National_Obesity_By_State cluster
## 1                         62.8 31.2                        11       1
## 2                         64.8 28.9                        44       2
## 3                         64.8 22.7                        41       2
## 4                         60.4 28.8                        43       2
## 5                         66.4 30.8                        19       2
## 6                         42.9 20.2                        25       1
## 7                         48.6 26.0                        36       2
## 8                         64.8 28.8                        43       2
## 9                         70.0 31.0                        50       2
## 10                        63.4 24.5                        27       2
## 11                        70.7 29.0                        29       2
## 12                        59.0 24.5                        27       2
## 13                        60.4 26.0                        36       2
## 14                        57.6 24.3                        32       2
## 15                        48.3 28.6                        12       1
## 16                        62.4 31.3                        31       2
## 17                        66.4 25.6                        49       2
## 18                        65.2 24.3                        32       2
## 19                        40.4 28.4                        42       2
## 20                        47.9 25.0                        10       1
## 21                        53.4 26.0                        36       2
## 22                        62.4 25.1                        48       2
## 23                        54.2 24.2                         2       1

After ruling out any correlating NA’s, the data was slimmed down in the ggplot to assess the cluster groupings as 2 different clusters with some outliers that overlapped.

confmat<-pamclust%>%group_by(BMI)%>%count(cluster)%>%arrange(desc(n))%>%
  pivot_wider(names_from="cluster",values_from="n",values_fill = list('n'=0))
confmat
## # A tibble: 18 x 3
## # Groups:   BMI [18]
##      BMI   `2`   `1`
##    <dbl> <int> <int>
##  1  26       3     0
##  2  24.3     2     0
##  3  24.5     2     0
##  4  28.8     2     0
##  5  20.2     0     1
##  6  22.7     1     0
##  7  24.2     0     1
##  8  25       0     1
##  9  25.1     1     0
## 10  25.6     1     0
## 11  28.4     1     0
## 12  28.6     0     1
## 13  28.9     1     0
## 14  29       1     0
## 15  30.8     1     0
## 16  31       1     0
## 17  31.2     0     1
## 18  31.3     1     0

Here, I checked the accuracy of creating 2 clusters, and they seem to be in majority within a 2 cluster range grouped by BMI due to 2 mediods.

round(sum(diag(as.matrix(confmat[,2:3])))/sum(confmat[,2:3]),3)
## [1] 0.13

Unfortunately, due to the clustering inaccuracies of the three total numeric variables, we produced a hit rate of 13%.

library(ggplot2)
pamclust %>% ggplot(aes(BMI, State.s.Average.Temperature, National_Obesity_By_State, color=cluster)) + geom_point()

This cluster graph maps out a hard to decipher relationship of clusters but the 2 clusters apparent will be used to assess the results of the data analysis. Before reading in my data, it was a thicker plot of data points and helped distinguish clusters; however, I believe the filtering out of any correlationg NAs or missing data during merging significantly reduced the data to more applicable or realistic measures in order to analyze a relationship.

library(GGally)
## 
## Attaching package: 'GGally'
## The following object is masked from 'package:dplyr':
## 
##     nasa
pam1$clustering
##  1  2  3  9 11 14 15 16 27 30 42 43 48 50 57 67 69 70 72 73 86 87 92 
##  1  2  2  2  2  1  2  2  2  2  2  2  2  2  1  2  2  2  2  1  2  2  1
pamclust %>% mutate(cluster=as.factor(pam1$clustering)) %>% ggpairs(columns = 1:3, aes(color=cluster))

In this various assorted plot diagram, I used ggpairs to find that the strongest correlation amongst the variables was between BMI and State’s Average temperature, with a value of 0.362 and the cluster plot showed the relationships separation between the factors of urban and rural to be the clearest; furthermore, the density plot correlating the urban or rural average BMIs showed visually the fact that the higher average for BMIs was consistently within urban areas over rural. The weakest correlation was between National Obesity by State and BMI, with a correlation value of 0.0695, virtually no structure.

In this 3D plot, all numeric variables averages were compared and National Obesity by State was input and BMI determined each axes’ shape. While it appears they each have their own clusters, National Obesity of each State and State’s Average Temperature were scattered within the same axis, not independent or showing a correlation with one another. However, BMI and State’s Average Temperature consistently showed a weak structure for a relationship, but yet had potential for a strong relationship if I had more data that did not compute to NA or missing for the BMI column when merged. In future research, I would opt out of including the national obesity by state variable and opt in more data to strengthen the possible relationship existing between BMI and State’s Average Temperature.

pamclust%>%plot_ly(x= ~BMI, y = ~State.s.Average.Temperature, z = ~National_Obesity_By_State, color= ~cluster,type = "scatter3d", mode = "markers", symbol = ~BMI, symbols = c('circle','x','square'))
## Warning: The shape palette can deal with a maximum of 6 discrete values because
## more than 6 becomes difficult to discriminate; you have 18. Consider
## specifying shapes manually if you must have them.
## Warning: The following are not valid symbol codes:
## 'NA'
## Valid symbols include:
## '0', 'circle', '100', 'circle-open', '200', 'circle-dot', '300', 'circle-open-dot', '1', 'square', '101', 'square-open', '201', 'square-dot', '301', 'square-open-dot', '2', 'diamond', '102', 'diamond-open', '202', 'diamond-dot', '302', 'diamond-open-dot', '3', 'cross', '103', 'cross-open', '203', 'cross-dot', '303', 'cross-open-dot', '4', 'x', '104', 'x-open', '204', 'x-dot', '304', 'x-open-dot', '5', 'triangle-up', '105', 'triangle-up-open', '205', 'triangle-up-dot', '305', 'triangle-up-open-dot', '6', 'triangle-down', '106', 'triangle-down-open', '206', 'triangle-down-dot', '306', 'triangle-down-open-dot', '7', 'triangle-left', '107', 'triangle-left-open', '207', 'triangle-left-dot', '307', 'triangle-left-open-dot', '8', 'triangle-right', '108', 'triangle-right-open', '208', 'triangle-right-dot', '308', 'triangle-right-open-dot', '9', 'triangle-ne', '109', 'triangle-ne-open', '209', 'triangle-ne-dot', '309', 'triangle-ne-open-dot', '10', 'triangle-se', '110', 'triangle-se-open', '210', 'triangle-se-dot', '310', 'triangle-se-open-dot', '11', 'triangle-sw', '111', 'triangle-sw-open', '211', 'triangle-sw-dot', '311', 'triangle-sw-open-dot', '12', 'triangle-nw', '112', 'triangle-nw-open', '212', 'triangle-nw-dot', '312', 'triangle-nw-open-dot', '13', 'pentagon', '113', 'pentagon-open', '213', 'pentagon-dot', '313', 'pentagon-open-dot', '14', 'hexagon', '114', 'hexagon-open', '214', 'hexagon-dot', '314', 'hexagon-open-dot', '15', 'hexagon2', '115', 'hexagon2-open', '215', 'hexagon2-dot', '315', 'hexagon2-open-dot', '16', 'octagon', '116', 'octagon-open', '216', 'octagon-dot', '316', 'octagon-open-dot', '17', 'star', '117', 'star-open', '217', 'star-dot', '317', 'star-open-dot', '18', 'hexagram', '118', 'hexagram-open', '218', 'hexagram-dot', '318', 'hexagram-open-dot', '19', 'star-triangle-up', '119', 'star-triangle-up-open', '219', 'star-triangle-up-dot', '319', 'star-triangle-up-open-dot', '20', 'star-triangle-down', '120', 'star-triangle-down-open', '220', 'star-triangle-down-dot', '320', 'star-triangle-down-open-dot', '21', 'star-square', '121', 'star-square-open', '221', 'star-square-dot', '321', 'star-square-open-dot', '22', 'star-diamond', '122', 'star-diamond-open', '222', 'star-diamond-dot', '322', 'star-diamond-open-dot', '23', 'diamond-tall', '123', 'diamond-tall-open', '223', 'diamond-tall-dot', '323', 'diamond-tall-open-dot', '24', 'diamond-wide', '124', 'diamond-wide-open', '224', 'diamond-wide-dot', '324', 'diamond-wide-open-dot', '25', 'hourglass', '125', 'hourglass-open', '26', 'bowtie', '126', 'bowtie-open', '27', 'circle-cross', '127', 'circle-cross-open', '28', 'circle-x', '128', 'circle-x-open', '29', 'square-cross', '129', 'square-cross-open', '30', 'square-x', '130', 'square-x-open', '31', 'diamond-cross', '131', 'diamond-cross-open', '32', 'diamond-x', '132', 'diamond-x-open', '33', 'cross-thin', '133', 'cross-thin-open', '34', 'x-thin', '134', 'x-thin-open', '35', 'asterisk', '135', 'asterisk-open', '36', 'hash', '136', 'hash-open', '236', 'hash-dot', '336', 'hash-open-dot', '37', 'y-up', '137', 'y-up-open', '38', 'y-down', '138', 'y-down-open', '39', 'y-left', '139', 'y-left-open', '40', 'y-right', '140', 'y-right-open', '41', 'line-ew', '141', 'line-ew-open', '42', 'line-ns', '142', 'line-ns-open', '43', 'line-ne', '143', 'line-ne-open', '44', 'line-nw', '144', 'line-nw-open

## Warning: The following are not valid symbol codes:
## 'NA'
## Valid symbols include:
## '0', 'circle', '100', 'circle-open', '200', 'circle-dot', '300', 'circle-open-dot', '1', 'square', '101', 'square-open', '201', 'square-dot', '301', 'square-open-dot', '2', 'diamond', '102', 'diamond-open', '202', 'diamond-dot', '302', 'diamond-open-dot', '3', 'cross', '103', 'cross-open', '203', 'cross-dot', '303', 'cross-open-dot', '4', 'x', '104', 'x-open', '204', 'x-dot', '304', 'x-open-dot', '5', 'triangle-up', '105', 'triangle-up-open', '205', 'triangle-up-dot', '305', 'triangle-up-open-dot', '6', 'triangle-down', '106', 'triangle-down-open', '206', 'triangle-down-dot', '306', 'triangle-down-open-dot', '7', 'triangle-left', '107', 'triangle-left-open', '207', 'triangle-left-dot', '307', 'triangle-left-open-dot', '8', 'triangle-right', '108', 'triangle-right-open', '208', 'triangle-right-dot', '308', 'triangle-right-open-dot', '9', 'triangle-ne', '109', 'triangle-ne-open', '209', 'triangle-ne-dot', '309', 'triangle-ne-open-dot', '10', 'triangle-se', '110', 'triangle-se-open', '210', 'triangle-se-dot', '310', 'triangle-se-open-dot', '11', 'triangle-sw', '111', 'triangle-sw-open', '211', 'triangle-sw-dot', '311', 'triangle-sw-open-dot', '12', 'triangle-nw', '112', 'triangle-nw-open', '212', 'triangle-nw-dot', '312', 'triangle-nw-open-dot', '13', 'pentagon', '113', 'pentagon-open', '213', 'pentagon-dot', '313', 'pentagon-open-dot', '14', 'hexagon', '114', 'hexagon-open', '214', 'hexagon-dot', '314', 'hexagon-open-dot', '15', 'hexagon2', '115', 'hexagon2-open', '215', 'hexagon2-dot', '315', 'hexagon2-open-dot', '16', 'octagon', '116', 'octagon-open', '216', 'octagon-dot', '316', 'octagon-open-dot', '17', 'star', '117', 'star-open', '217', 'star-dot', '317', 'star-open-dot', '18', 'hexagram', '118', 'hexagram-open', '218', 'hexagram-dot', '318', 'hexagram-open-dot', '19', 'star-triangle-up', '119', 'star-triangle-up-open', '219', 'star-triangle-up-dot', '319', 'star-triangle-up-open-dot', '20', 'star-triangle-down', '120', 'star-triangle-down-open', '220', 'star-triangle-down-dot', '320', 'star-triangle-down-open-dot', '21', 'star-square', '121', 'star-square-open', '221', 'star-square-dot', '321', 'star-square-open-dot', '22', 'star-diamond', '122', 'star-diamond-open', '222', 'star-diamond-dot', '322', 'star-diamond-open-dot', '23', 'diamond-tall', '123', 'diamond-tall-open', '223', 'diamond-tall-dot', '323', 'diamond-tall-open-dot', '24', 'diamond-wide', '124', 'diamond-wide-open', '224', 'diamond-wide-dot', '324', 'diamond-wide-open-dot', '25', 'hourglass', '125', 'hourglass-open', '26', 'bowtie', '126', 'bowtie-open', '27', 'circle-cross', '127', 'circle-cross-open', '28', 'circle-x', '128', 'circle-x-open', '29', 'square-cross', '129', 'square-cross-open', '30', 'square-x', '130', 'square-x-open', '31', 'diamond-cross', '131', 'diamond-cross-open', '32', 'diamond-x', '132', 'diamond-x-open', '33', 'cross-thin', '133', 'cross-thin-open', '34', 'x-thin', '134', 'x-thin-open', '35', 'asterisk', '135', 'asterisk-open', '36', 'hash', '136', 'hash-open', '236', 'hash-dot', '336', 'hash-open-dot', '37', 'y-up', '137', 'y-up-open', '38', 'y-down', '138', 'y-down-open', '39', 'y-left', '139', 'y-left-open', '40', 'y-right', '140', 'y-right-open', '41', 'line-ew', '141', 'line-ew-open', '42', 'line-ns', '142', 'line-ns-open', '43', 'line-ne', '143', 'line-ne-open', '44', 'line-nw', '144', 'line-nw-open
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels

## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels